Overview
Brought to you by YData
Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 5000 |
| Missing cells | 8461 |
| Missing cells (%) | 14.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 468.9 KiB |
| Average record size in memory | 96.0 B |
Variable types
| Numeric | 4 |
|---|---|
| Text | 6 |
| Categorical | 2 |
Alerts
| Alert | Type |
|---|---|
| customer_id is highly correlated overall with gender and 1 other field | High correlation |
| gender is highly correlated overall with customer_id and 1 other field | High correlation |
| marital_status is highly correlated overall with customer_id and 1 other field | High correlation |
| tax_id is highly correlated overall with gender and 1 other field | High correlation |
| nin has 2062 (41.2%) missing values | Missing |
| passport has 2481 (49.6%) missing values | Missing |
| drivers_license has 999 (20.0%) missing values | Missing |
| voters_card has 1001 (20.0%) missing values | Missing |
| tax_id has 952 (19.0%) missing values | Missing |
| cac_number has 966 (19.3%) missing values | Missing |
| customer_id has unique values | Unique |
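The per-column missing counts can be reconciled with the dataset-level "Missing cells" figure. The sketch below is only an arithmetic cross-check of numbers copied from this report; the variable names are illustrative and it does not reproduce how the profiling tool computes them.

```python
# Figures copied from the dataset statistics and the missing-value alerts.
n_rows = 5000
n_columns = 12
missing_per_column = {
    "nin": 2062,
    "passport": 2481,
    "drivers_license": 999,
    "voters_card": 1001,
    "tax_id": 952,
    "cac_number": 966,
}

total_missing = sum(missing_per_column.values())
total_cells = n_rows * n_columns
missing_pct = 100 * total_missing / total_cells

print(total_missing)          # total missing cells across all columns
print(f"{missing_pct:.1f}%")  # share of all cells that are missing

# Per-column share, relative to the number of rows.
for name, count in missing_per_column.items():
    print(f"{name}: {100 * count / n_rows:.1f}%")
```

The dataset-level percentage divides by all cells (rows times columns), while each per-column percentage divides by the number of rows only.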
Reproduction
| Analysis started | 2025-01-16 11:36:40.643159 |
|---|---|
| Analysis finished | 2025-01-16 11:54:12.084108 |
| Duration | 17 minutes and 31.44 seconds |
| Software version | ydata-profiling v0.0.dev0 |
| Download configuration | config.json |
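The Duration row can be cross-checked from the two timestamps. A minimal sketch using only the Python standard library, with the timestamps copied from the table above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction table.
started = datetime.fromisoformat("2025-01-16 11:36:40.643159")
finished = datetime.fromisoformat("2025-01-16 11:54:12.084108")

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```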
Variables
customer_id
Real number (ℝ)
High correlation  Unique 
| Distinct | 5000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.486769 × 10¹⁰ |
| Minimum | 1.0027741 × 10¹⁰ |
| Maximum | 9.9997026 × 10¹⁰ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.2 KiB |
Quantile statistics
| Minimum | 1.0027741 × 10¹⁰ |
|---|---|
| 5-th percentile | 1.4736439 × 10¹⁰ |
| Q1 | 3.2364211 × 10¹⁰ |
| median | 5.4868531 × 10¹⁰ |
| Q3 | 7.716611 × 10¹⁰ |
| 95-th percentile | 9.5623328 × 10¹⁰ |
| Maximum | 9.9997026 × 10¹⁰ |
| Range | 8.9969285 × 10¹⁰ |
| Interquartile range (IQR) | 4.4801899 × 10¹⁰ |
Descriptive statistics
| Standard deviation | 2.5988812 × 10¹⁰ |
|---|---|
| Coefficient of variation (CV) | 0.47366332 |
| Kurtosis | -1.1957044 |
| Mean | 5.486769 × 10¹⁰ |
| Median Absolute Deviation (MAD) | 2.2419024 × 10¹⁰ |
| Skewness | 0.0022546826 |
| Sum | 2.7433845 × 10¹⁴ |
| Variance | 6.7541834 × 10²⁰ |
| Monotonicity | Not monotonic |
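Several of the customer_id statistics are derived from others: range is maximum minus minimum, IQR is Q3 minus Q1, CV is standard deviation divided by mean, and variance is the square of the standard deviation. The sketch below recomputes them from the rounded values printed in this report, so agreement is only expected up to rounding; the tolerance is an assumption chosen for eight significant digits.

```python
import math

# Rounded values as printed in the customer_id tables.
minimum, maximum = 1.0027741e10, 9.9997026e10
q1, q3 = 3.2364211e10, 7.716611e10
mean, std = 5.486769e10, 2.5988812e10

derived = {
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "cv": std / mean,
    "variance": std ** 2,
}
reported = {
    "range": 8.9969285e10,
    "iqr": 4.4801899e10,
    "cv": 0.47366332,
    "variance": 6.7541834e20,
}

# Compare each derived value with the reported one, allowing for rounding.
agreement = {
    name: math.isclose(derived[name], reported[name], rel_tol=1e-6)
    for name in derived
}
print(agreement)
```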
Common values
| Value | Count | Frequency (%) |
| 4.247409821 × 10¹⁰ | 1 | < 0.1% |
| 8.972423399 × 10¹⁰ | 1 | < 0.1% |
| 7.129482598 × 10¹⁰ | 1 | < 0.1% |
| 9.984909754 × 10¹⁰ | 1 | < 0.1% |
| 3.279537511 × 10¹⁰ | 1 | < 0.1% |
| 9.6176223 × 10¹⁰ | 1 | < 0.1% |
| 7.736425226 × 10¹⁰ | 1 | < 0.1% |
| 9.573946874 × 10¹⁰ | 1 | < 0.1% |
| 1.302285831 × 10¹⁰ | 1 | < 0.1% |
| 9.833272018 × 10¹⁰ | 1 | < 0.1% |
| Other values (4990) | 4990 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1.002774079 × 10¹⁰ | 1 | |
| 1.003222352 × 10¹⁰ | 1 | |
| 1.003661683 × 10¹⁰ | 1 | |
| 1.005587181 × 10¹⁰ | 1 | |
| 1.006153475 × 10¹⁰ | 1 | |
| 1.007176375 × 10¹⁰ | 1 | |
| 1.00747918 × 10¹⁰ | 1 | |
| 1.010684038 × 10¹⁰ | 1 | |
| 1.018308557 × 10¹⁰ | 1 | |
| 1.024259789 × 10¹⁰ | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 9.999702598 × 10¹⁰ | 1 | |
| 9.998872 × 10¹⁰ | 1 | |
| 9.997194328 × 10¹⁰ | 1 | |
| 9.992913142 × 10¹⁰ | 1 | |
| 9.991784745 × 10¹⁰ | 1 | |
| 9.990584014 × 10¹⁰ | 1 | |
| 9.988322142 × 10¹⁰ | 1 | |
| 9.988153451 × 10¹⁰ | 1 | |
| 9.987927393 × 10¹⁰ | 1 | |
| 9.987727271 × 10¹⁰ | 1 |
occupation
Text
| Distinct | 638 |
|---|---|
| Distinct (%) | 12.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 39.2 KiB |
Length
| Max length | 59 |
|---|---|
| Median length | 40 |
| Mean length | 20.6998 |
| Min length | 3 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Equality and diversity officer |
|---|---|
| 2nd row | Occupational therapist |
| 3rd row | Physiotherapist |
| 4th row | Psychologist, occupational |
| 5th row | Hydrologist |
Words
| Value | Count | Frequency (%) |
| officer | 439 | 3.8% |
| engineer | 421 | 3.7% |
| manager | 417 | 3.6% |
| scientist | 212 | 1.9% |
| designer | 204 | 1.8% |
| surveyor | 169 | 1.5% |
| and | 157 | 1.4% |
| education | 130 | 1.1% |
| therapist | 121 | 1.1% |
| teacher | 115 | 1.0% |
| Other values (524) | 9071 |
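The table above counts lowercased tokens such as "officer" and "and" rather than whole occupation strings. How the report tokenizes text is not stated here, so the sketch below is an assumption: it lowercases each value and keeps runs of letters, then counts tokens with `collections.Counter`. The input is the five sample rows shown for occupation, and `tokenize` is a hypothetical helper.

```python
import re
from collections import Counter

# The five occupation sample rows from this report.
sample = [
    "Equality and diversity officer",
    "Occupational therapist",
    "Physiotherapist",
    "Psychologist, occupational",
    "Hydrologist",
]

def tokenize(text):
    # Assumed tokenization: lowercase, then keep runs of ASCII letters.
    return re.findall(r"[a-z]+", text.lower())

word_counts = Counter(word for row in sample for word in tokenize(row))
print(word_counts.most_common(3))
```

Under this assumption a multi-word value contributes one count per word, which is why the word total exceeds the number of rows.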
Most occurring characters
| Value | Count | Frequency (%) |
| e | 10381 | 10.0% |
| i | 9136 | 8.8% |
| r | 8719 | 8.4% |
| a | 7761 | 7.5% |
| t | 7255 | 7.0% |
| n | 7084 | 6.8% |
| (space) | 6456 | 6.2% |
| o | 5991 | 5.8% |
| s | 5576 | 5.4% |
| c | 5027 | 4.9% |
| Other values (44) | 30113 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 103499 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 103499 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 103499 |
nin
Real number (ℝ)
Missing 
| Distinct | 2938 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 2062 |
| Missing (%) | 41.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.3994607 × 10¹⁰ |
| Minimum | 1.003104 × 10¹⁰ |
| Maximum | 9.9987051 × 10¹⁰ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.2 KiB |
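The nin figures (2938 distinct, 2062 missing, 100.0% distinct) are consistent with the distinct percentage being taken over non-missing values rather than over all 5000 rows. This is an inference from the printed numbers, not a statement about the tool's internals. A small sketch, with `distinct_pct` as a hypothetical helper:

```python
def distinct_pct(distinct, n_rows, missing):
    """Distinct values as a percentage of the non-missing values."""
    return 100 * distinct / (n_rows - missing)

# (distinct, rows, missing) as printed in this report.
figures = {
    "occupation": (638, 5000, 0),
    "nin": (2938, 5000, 2062),
    "passport": (2519, 5000, 2481),
    "postal_code": (4873, 5000, 0),
}
for name, args in figures.items():
    print(f"{name}: {distinct_pct(*args):.1f}%")
```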
Quantile statistics
| Minimum | 1.003104 × 10¹⁰ |
|---|---|
| 5-th percentile | 1.3980859 × 10¹⁰ |
| Q1 | 3.1371027 × 10¹⁰ |
| median | 5.3718349 × 10¹⁰ |
| Q3 | 7.6632174 × 10¹⁰ |
| 95-th percentile | 9.5309973 × 10¹⁰ |
| Maximum | 9.9987051 × 10¹⁰ |
| Range | 8.9956011 × 10¹⁰ |
| Interquartile range (IQR) | 4.5261147 × 10¹⁰ |
Descriptive statistics
| Standard deviation | 2.608408 × 10¹⁰ |
|---|---|
| Coefficient of variation (CV) | 0.48308676 |
| Kurtosis | -1.2139339 |
| Mean | 5.3994607 × 10¹⁰ |
| Median Absolute Deviation (MAD) | 2.2604135 × 10¹⁰ |
| Skewness | 0.040599515 |
| Sum | 1.5863616 × 10¹⁴ |
| Variance | 6.803792 × 10²⁰ |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 8.383041664 × 10¹⁰ | 1 | < 0.1% |
| 9.21698988 × 10¹⁰ | 1 | < 0.1% |
| 6.914023413 × 10¹⁰ | 1 | < 0.1% |
| 8.754477936 × 10¹⁰ | 1 | < 0.1% |
| 6.542377583 × 10¹⁰ | 1 | < 0.1% |
| 5.539080371 × 10¹⁰ | 1 | < 0.1% |
| 3.783526885 × 10¹⁰ | 1 | < 0.1% |
| 8.312932412 × 10¹⁰ | 1 | < 0.1% |
| 1.277674156 × 10¹⁰ | 1 | < 0.1% |
| 7.972256754 × 10¹⁰ | 1 | < 0.1% |
| Other values (2928) | 2928 | |
| (Missing) | 2062 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1.003104043 × 10¹⁰ | 1 | |
| 1.00372131 × 10¹⁰ | 1 | |
| 1.006946883 × 10¹⁰ | 1 | |
| 1.008788974 × 10¹⁰ | 1 | |
| 1.009964581 × 10¹⁰ | 1 | |
| 1.010775923 × 10¹⁰ | 1 | |
| 1.010875228 × 10¹⁰ | 1 | |
| 1.015564269 × 10¹⁰ | 1 | |
| 1.019120062 × 10¹⁰ | 1 | |
| 1.025798237 × 10¹⁰ | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 9.998705094 × 10¹⁰ | 1 | |
| 9.9942715 × 10¹⁰ | 1 | |
| 9.990515697 × 10¹⁰ | 1 | |
| 9.98944566 × 10¹⁰ | 1 | |
| 9.978981024 × 10¹⁰ | 1 | |
| 9.976167637 × 10¹⁰ | 1 | |
| 9.973182168 × 10¹⁰ | 1 | |
| 9.971547993 × 10¹⁰ | 1 | |
| 9.949619705 × 10¹⁰ | 1 | |
| 9.946956435 × 10¹⁰ | 1 |
passport
Text
Missing 
| Distinct | 2519 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 2481 |
| Missing (%) | 49.6% |
| Memory size | 39.2 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 8.5204446 |
| Min length | 8 |
Unique
| Unique | 2519 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | XO6610220 |
|---|---|
| 2nd row | MK8799712 |
| 3rd row | UZ2368406 |
| 4th row | F0385929 |
| 5th row | V2133231 |
Words
| Value | Count | Frequency (%) |
| wm3915519 | 1 | < 0.1% |
| be2758840 | 1 | < 0.1% |
| qe8551565 | 1 | < 0.1% |
| hi3522339 | 1 | < 0.1% |
| f5805501 | 1 | < 0.1% |
| i7403791 | 1 | < 0.1% |
| bg0790627 | 1 | < 0.1% |
| m3815892 | 1 | < 0.1% |
| w8863959 | 1 | < 0.1% |
| os8011310 | 1 | < 0.1% |
| Other values (2509) | 2509 |
Most occurring characters
| Value | Count | Frequency (%) |
| 8 | 1866 | |
| 4 | 1836 | |
| 9 | 1803 | |
| 5 | 1775 | |
| 1 | 1773 | |
| 3 | 1766 | |
| 7 | 1761 | |
| 2 | 1697 | |
| 6 | 1687 | |
| 0 | 1669 | |
| Other values (26) | 3830 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 21463 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 21463 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 21463 |
country_of_birth
Text
| Distinct | 243 |
|---|---|
| Distinct (%) | 4.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 39.2 KiB |
Length
| Max length | 51 |
|---|---|
| Median length | 33 |
| Mean length | 10.7664 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Malaysia |
|---|---|
| 2nd row | American Samoa |
| 3rd row | Cook Islands |
| 4th row | Mauritius |
| 5th row | Zimbabwe |
Words
| Value | Count | Frequency (%) |
| islands | 357 | 4.6% |
| and | 243 | 3.1% |
| saint | 158 | 2.0% |
| republic | 139 | 1.8% |
| united | 105 | 1.3% |
| island | 95 | 1.2% |
| south | 78 | 1.0% |
| french | 72 | 0.9% |
| states | 63 | 0.8% |
| the | 63 | 0.8% |
| Other values (298) | 6407 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 7364 | 13.7% |
| n | 4454 | 8.3% |
| i | 4300 | 8.0% |
| e | 3713 | 6.9% |
| r | 3088 | 5.7% |
| (space) | 2780 | 5.2% |
| o | 2664 | 4.9% |
| t | 2281 | 4.2% |
| l | 2261 | 4.2% |
| s | 2209 | 4.1% |
| Other values (49) | 18718 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 53832 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 53832 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 53832 |
marital_status
Categorical
High correlation 
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 39.2 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 6.6564 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Single |
|---|---|
| 2nd row | Widowed |
| 3rd row | Married |
| 4th row | Married |
| 5th row | Single |
Common Values
| Value | Count | Frequency (%) |
| Single | 1718 | |
| Married | 1698 | |
| Widowed | 1584 |
Words
| Value | Count | Frequency (%) |
| single | 1718 | |
| married | 1698 | |
| widowed | 1584 |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 5000 | |
| e | 5000 | |
| d | 4866 | |
| r | 3396 | |
| S | 1718 | 5.2% |
| l | 1718 | 5.2% |
| g | 1718 | 5.2% |
| n | 1718 | 5.2% |
| M | 1698 | 5.1% |
| a | 1698 | 5.1% |
| Other values (3) | 4752 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 33282 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 33282 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 33282 |
drivers_license
Text
Missing 
| Distinct | 4001 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 999 |
| Missing (%) | 20.0% |
| Memory size | 39.2 KiB |
Length
| Max length | 17 |
|---|---|
| Median length | 17 |
| Mean length | 17 |
| Min length | 17 |
Unique
| Unique | 4001 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | U-ILUAQ44-1396390 |
|---|---|
| 2nd row | L-PRCHX61-1885792 |
| 3rd row | O-XOJTJ30-0791028 |
| 4th row | P-ZEZIS45-1063384 |
| 5th row | X-LNUWH93-5717866 |
Words
| Value | Count | Frequency (%) |
| p-rafgd72-9104520 | 1 | < 0.1% |
| v-tfefc51-9856888 | 1 | < 0.1% |
| e-yggvj59-3936940 | 1 | < 0.1% |
| y-ujkrp79-2105588 | 1 | < 0.1% |
| f-nnnuz17-1389761 | 1 | < 0.1% |
| s-ilqju98-9461441 | 1 | < 0.1% |
| b-ggdto43-1463479 | 1 | < 0.1% |
| q-occaa62-7022888 | 1 | < 0.1% |
| j-trytm56-9182793 | 1 | < 0.1% |
| l-acknv01-8573658 | 1 | < 0.1% |
| Other values (3991) | 3991 |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 8002 | 11.8% |
| 1 | 3690 | 5.4% |
| 9 | 3649 | 5.4% |
| 3 | 3646 | 5.4% |
| 7 | 3601 | 5.3% |
| 5 | 3598 | 5.3% |
| 8 | 3585 | 5.3% |
| 2 | 3573 | 5.3% |
| 0 | 3561 | 5.2% |
| 4 | 3556 | 5.2% |
| Other values (27) | 27556 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 68017 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 68017 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 68017 |
voters_card
Text
Missing 
| Distinct | 3999 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 1001 |
| Missing (%) | 20.0% |
| Memory size | 39.2 KiB |
Length
| Max length | 16 |
|---|---|
| Median length | 16 |
| Mean length | 16 |
| Min length | 16 |
Unique
| Unique | 3999 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | XPY/186/07/20045 |
|---|---|
| 2nd row | MSA/775/26/44500 |
| 3rd row | BKQ/568/35/91728 |
| 4th row | LIA/225/36/59688 |
| 5th row | VIC/056/69/83740 |
Words
| Value | Count | Frequency (%) |
| mbr/062/12/29701 | 1 | < 0.1% |
| pue/451/15/31868 | 1 | < 0.1% |
| htg/095/02/17329 | 1 | < 0.1% |
| fki/123/32/16265 | 1 | < 0.1% |
| cha/802/00/59019 | 1 | < 0.1% |
| dtb/263/46/30656 | 1 | < 0.1% |
| oay/198/97/37923 | 1 | < 0.1% |
| rke/290/24/12801 | 1 | < 0.1% |
| bpn/837/80/85131 | 1 | < 0.1% |
| cdo/939/29/41441 | 1 | < 0.1% |
| Other values (3989) | 3989 |
Most occurring characters
| Value | Count | Frequency (%) |
| / | 11997 | |
| 6 | 4164 | 6.5% |
| 7 | 4069 | 6.4% |
| 0 | 4056 | 6.3% |
| 2 | 4027 | 6.3% |
| 8 | 3982 | 6.2% |
| 9 | 3980 | 6.2% |
| 3 | 3970 | 6.2% |
| 5 | 3940 | 6.2% |
| 4 | 3916 | 6.1% |
| Other values (27) | 15883 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 63984 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 63984 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 63984 |
tax_id
Real number (ℝ)
High correlation  Missing 
| Distinct | 4048 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 952 |
| Missing (%) | 19.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.0033818 × 10⁹ |
| Minimum | 2116573 |
| Maximum | 9.9951354 × 10⁹ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.2 KiB |
Quantile statistics
| Minimum | 2116573 |
|---|---|
| 5-th percentile | 4.8036053 × 10⁸ |
| Q1 | 2.5513923 × 10⁹ |
| median | 4.9964317 × 10⁹ |
| Q3 | 7.4661577 × 10⁹ |
| 95-th percentile | 9.5221473 × 10⁹ |
| Maximum | 9.9951354 × 10⁹ |
| Range | 9.9930188 × 10⁹ |
| Interquartile range (IQR) | 4.9147655 × 10⁹ |
Descriptive statistics
| Standard deviation | 2.8622933 × 10⁹ |
|---|---|
| Coefficient of variation (CV) | 0.57207174 |
| Kurtosis | -1.1621938 |
| Mean | 5.0033818 × 10⁹ |
| Median Absolute Deviation (MAD) | 2.4556884 × 10⁹ |
| Skewness | 0.0005591873 |
| Sum | 2.025369 × 10¹³ |
| Variance | 8.1927232 × 10¹⁸ |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 8988517976 | 1 | < 0.1% |
| 6088010554 | 1 | < 0.1% |
| 6862142259 | 1 | < 0.1% |
| 3648292346 | 1 | < 0.1% |
| 1681918254 | 1 | < 0.1% |
| 5778751994 | 1 | < 0.1% |
| 9514932667 | 1 | < 0.1% |
| 4892378526 | 1 | < 0.1% |
| 7304356593 | 1 | < 0.1% |
| 8518911432 | 1 | < 0.1% |
| Other values (4038) | 4038 | |
| (Missing) | 952 | 19.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 2116573 | 1 | |
| 4180345 | 1 | |
| 7472042 | 1 | |
| 9873508 | 1 | |
| 11642587 | 1 | |
| 12567432 | 1 | |
| 17285827 | 1 | |
| 26098980 | 1 | |
| 27812918 | 1 | |
| 36794721 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 9995135373 | 1 | |
| 9994960899 | 1 | |
| 9994043807 | 1 | |
| 9991878335 | 1 | |
| 9987801168 | 1 | |
| 9986881900 | 1 | |
| 9983897102 | 1 | |
| 9979310343 | 1 | |
| 9976985254 | 1 | |
| 9975649558 | 1 |
cac_number
Text
Missing 
| Distinct | 4034 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 966 |
| Missing (%) | 19.3% |
| Memory size | 39.2 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Unique
| Unique | 4034 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | BN/6363282 |
|---|---|
| 2nd row | BN/7765086 |
| 3rd row | RC/7145711 |
| 4th row | RC/7292868 |
| 5th row | BN/8508999 |
Words
| Value | Count | Frequency (%) |
| rc/1534651 | 1 | < 0.1% |
| rc/8422783 | 1 | < 0.1% |
| bn/2308183 | 1 | < 0.1% |
| rc/0626422 | 1 | < 0.1% |
| rc/2419496 | 1 | < 0.1% |
| bn/4813609 | 1 | < 0.1% |
| rc/3275050 | 1 | < 0.1% |
| rc/2338970 | 1 | < 0.1% |
| bn/5503072 | 1 | < 0.1% |
| bn/4344227 | 1 | < 0.1% |
| Other values (4024) | 4024 |
Most occurring characters
| Value | Count | Frequency (%) |
| / | 4034 | 10.0% |
| 7 | 2905 | 7.2% |
| 5 | 2884 | 7.1% |
| 1 | 2869 | 7.1% |
| 2 | 2869 | 7.1% |
| 0 | 2803 | 6.9% |
| 6 | 2802 | 6.9% |
| 4 | 2798 | 6.9% |
| 9 | 2778 | 6.9% |
| 8 | 2768 | 6.9% |
| Other values (5) | 10830 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 40340 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 40340 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 40340 |
gender
Categorical
High correlation 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 39.2 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | M |
|---|---|
| 2nd row | F |
| 3rd row | F |
| 4th row | F |
| 5th row | F |
Common Values
| Value | Count | Frequency (%) |
| M | 2501 | |
| F | 2499 |
Words
| Value | Count | Frequency (%) |
| m | 2501 | |
| f | 2499 |
Most occurring characters
| Value | Count | Frequency (%) |
| M | 2501 | |
| F | 2499 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 5000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 5000 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 5000 |
postal_code
Real number (ℝ)
| Distinct | 4873 |
|---|---|
| Distinct (%) | 97.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 50756.144 |
| Minimum | 522 |
| Maximum | 99892 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.2 KiB |
Quantile statistics
| Minimum | 522 |
|---|---|
| 5-th percentile | 5288.2 |
| Q1 | 25633 |
| median | 50960 |
| Q3 | 75853.5 |
| 95-th percentile | 94850.25 |
| Maximum | 99892 |
| Range | 99370 |
| Interquartile range (IQR) | 50220.5 |
Descriptive statistics
| Standard deviation | 28914.729 |
|---|---|
| Coefficient of variation (CV) | 0.56967939 |
| Kurtosis | -1.2134979 |
| Mean | 50756.144 |
| Median Absolute Deviation (MAD) | 25096 |
| Skewness | -0.028345452 |
| Sum | 2.5378072 × 10⁸ |
| Variance | 8.3606157 × 10⁸ |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 73082 | 3 | 0.1% |
| 55165 | 3 | 0.1% |
| 85830 | 3 | 0.1% |
| 92896 | 3 | 0.1% |
| 62782 | 2 | < 0.1% |
| 46707 | 2 | < 0.1% |
| 66802 | 2 | < 0.1% |
| 10029 | 2 | < 0.1% |
| 70993 | 2 | < 0.1% |
| 73928 | 2 | < 0.1% |
| Other values (4863) | 4976 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 522 | 1 | |
| 562 | 1 | |
| 563 | 1 | |
| 565 | 1 | |
| 604 | 2 | |
| 615 | 1 | |
| 629 | 1 | |
| 648 | 1 | |
| 661 | 1 | |
| 677 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 99892 | 1 | |
| 99885 | 1 | |
| 99871 | 1 | |
| 99856 | 1 | |
| 99852 | 1 | |
| 99820 | 1 | |
| 99796 | 1 | |
| 99697 | 1 | |
| 99692 | 1 | |
| 99681 | 1 |
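The Sample section of this report shows postal_code values with leading zeros (for example 08529 and 00661), while the statistics above list values such as 522 and 661, so the column is being summarized as a number. If a numeric copy ever needs to be mapped back to the padded form, a minimal sketch follows. The width of 5 is an assumption read off the sample rows, and `restore_postal_code` is a hypothetical helper.

```python
def restore_postal_code(value, width=5):
    # width=5 is an assumption taken from the sample rows, not a documented rule.
    return str(int(value)).zfill(width)

for number in (522, 661, 8529, 99892):
    print(number, "->", restore_postal_code(number))
```

Keeping such columns as text from the start avoids the problem; padding afterwards only works when the intended width is known.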
Interactions
Correlations
| | customer_id | gender | marital_status | nin | postal_code | tax_id |
|---|---|---|---|---|---|---|
| customer_id | 1.000 | 1.000 | 1.000 | -0.006 | -0.008 | 0.018 |
| gender | 1.000 | 1.000 | 0.000 | 0.000 | 0.025 | 1.000 |
| marital_status | 1.000 | 0.000 | 1.000 | 0.030 | 0.000 | 1.000 |
| nin | -0.006 | 0.000 | 0.030 | 1.000 | -0.012 | -0.008 |
| postal_code | -0.008 | 0.025 | 0.000 | -0.012 | 1.000 | -0.019 |
| tax_id | 0.018 | 1.000 | 1.000 | -0.008 | -0.019 | 1.000 |
Missing values
Sample
First rows
| | customer_id | occupation | nin | passport | country_of_birth | marital_status | drivers_license | voters_card | tax_id | cac_number | gender | postal_code |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 89724233993 | Equality and diversity officer | 7.250263e+10 | XO6610220 | Malaysia | Single | U-ILUAQ44-1396390 | XPY/186/07/20045 | None | BN/6363282 | M | 08529 |
| 1 | 71294825978 | Occupational therapist | NaN | MK8799712 | American Samoa | Widowed | L-PRCHX61-1885792 | MSA/775/26/44500 | 4701744884 | BN/7765086 | F | 89203 |
| 2 | 99849097541 | Physiotherapist | 9.714721e+10 | None | Cook Islands | Married | O-XOJTJ30-0791028 | BKQ/568/35/91728 | 0328856397 | RC/7145711 | F | 08602 |
| 3 | 32795375109 | Psychologist, occupational | 5.301804e+10 | None | Mauritius | Married | P-ZEZIS45-1063384 | LIA/225/36/59688 | 5798081552 | None | F | 00661 |
| 4 | 96176222995 | Hydrologist | 1.569574e+10 | UZ2368406 | Zimbabwe | Single | None | None | 8799560272 | None | F | 88503 |
| 5 | 77364252258 | Stage manager | NaN | None | Korea | Single | X-LNUWH93-5717866 | VIC/056/69/83740 | 8988858393 | RC/7292868 | F | 79695 |
| 6 | 95739468736 | Scientist, research (life sciences) | 7.055348e+10 | F0385929 | French Polynesia | Widowed | C-FPBDD48-2527049 | SZI/197/59/72221 | None | BN/8508999 | F | 55126 |
| 7 | 42716803431 | Clinical molecular geneticist | NaN | None | Saint Vincent and the Grenadines | Married | X-EHLCX81-8541173 | AOW/423/33/46946 | 7737781422 | BN/2430902 | F | 32299 |
| 8 | 15808923190 | Scientist, water quality | 6.716493e+10 | V2133231 | El Salvador | Single | U-PINPB59-1051089 | CUT/648/23/54637 | 2912970628 | None | M | 69290 |
| 9 | 24574806522 | Therapist, art | NaN | None | Congo | Married | I-FEMSQ99-9704021 | PDC/305/83/57158 | 0934847368 | RC/7448566 | F | 87656 |
Last rows
| | customer_id | occupation | nin | passport | country_of_birth | marital_status | drivers_license | voters_card | tax_id | cac_number | gender | postal_code |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4990 | 75448158133 | Teacher, special educational needs | 6.205005e+10 | ZV7328849 | Belize | Single | None | None | 7760550675 | BN/6602019 | F | 73395 |
| 4991 | 75268145704 | Equality and diversity officer | NaN | R6441050 | Saint Vincent and the Grenadines | Married | K-FAVAY57-1167847 | IGV/602/21/04594 | None | None | M | 41207 |
| 4992 | 98299252818 | Armed forces technical officer | NaN | None | Fiji | Widowed | None | KWH/316/62/57114 | None | None | M | 14384 |
| 4993 | 45262454278 | Social worker | 3.970058e+10 | None | France | Widowed | None | MPF/994/60/90310 | 2112970651 | BN/6386428 | F | 02554 |
| 4994 | 12445037534 | Administrator, sports | 2.246730e+10 | None | Mauritania | Single | None | URC/321/89/25761 | 1663752262 | BN/9635642 | M | 09190 |
| 4995 | 15860174900 | Research scientist (maths) | 9.891724e+10 | None | Croatia | Single | V-CQSEK72-7551835 | PJI/175/56/88076 | 4825132141 | BN/9657759 | F | 52966 |
| 4996 | 31385632341 | Psychologist, forensic | 2.766576e+10 | HH3127708 | Latvia | Single | P-BWCJB76-0337214 | SBQ/576/29/30139 | 2589942052 | RC/4697396 | F | 59597 |
| 4997 | 69767870710 | English as a foreign language teacher | 2.066151e+10 | None | Saint Kitts and Nevis | Single | L-WGRPD35-9964033 | NEH/531/69/07777 | 6534524020 | RC/6967565 | F | 64136 |
| 4998 | 62468608105 | Barrister | 7.742233e+10 | None | Saint Lucia | Widowed | Q-DAXEY00-5486546 | None | 7264167475 | BN/6262542 | M | 71869 |
| 4999 | 42474098214 | Health visitor | 3.549547e+10 | FZ8762915 | United Kingdom | Married | G-TIBMW62-0124660 | None | 8913371353 | RC/9169873 | M | 53553 |
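The sample rows and the fixed lengths reported for passport (8 to 9 characters), drivers_license (17), voters_card (16), and cac_number (10) suggest regular formats. The patterns below are inferred from those rows only and are assumptions, not official formats; `PATTERNS` and `matches_format` are hypothetical names for a simple format check.

```python
import re

# Patterns inferred from the sample rows in this report (assumptions).
PATTERNS = {
    "passport": re.compile(r"[A-Z]{1,2}\d{7}"),
    "drivers_license": re.compile(r"[A-Z]-[A-Z]{5}\d{2}-\d{7}"),
    "voters_card": re.compile(r"[A-Z]{3}/\d{3}/\d{2}/\d{5}"),
    "cac_number": re.compile(r"(BN|RC)/\d{7}"),
}

def matches_format(field, value):
    """Return True when value fully matches the assumed pattern for field."""
    if value is None:  # missing values appear as None in the sample
        return False
    return PATTERNS[field].fullmatch(value) is not None

# Values from the first sample row.
first_row = {
    "passport": "XO6610220",
    "drivers_license": "U-ILUAQ44-1396390",
    "voters_card": "XPY/186/07/20045",
    "cac_number": "BN/6363282",
}
for field, value in first_row.items():
    print(field, matches_format(field, value))
```

A check like this can flag malformed identifiers before profiling, but the patterns should be confirmed against the real specification of each document type first.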